Dynamic Reward-Based Dueling Deep Dyna-Q: Robust Policy Learning in Noisy Environments
Related resources
Learning Robust Dialog Policies in Noisy Environments
Modern virtual personal assistants provide a convenient interface for completing daily tasks via voice commands. An important consideration for these assistants is the ability to recover from automatic speech recognition (ASR) and natural language understanding (NLU) errors. In this paper, we focus on learning robust dialog policies to recover from these errors. To this end, we develop a user s...
Geometric Concept Acquisition in a Dueling Deep Q-Network
Explaining how intelligent systems come to embody knowledge of deductive concepts through inductive learning is a fundamental challenge of both cognitive science and artificial intelligence. We address this challenge by exploring how a deep reinforcement learning agent, occupying a setting similar to those encountered by early-stage mathematical concept learners, comes to represent ideas such a...
Dueling Network Architectures for Deep Reinforcement Learning
In recent years there have been many successes of using deep representations in reinforcement learning. Still, many of these applications use conventional architectures, such as convolutional networks, LSTMs, or auto-encoders. In this paper, we present a new neural network architecture for model-free reinforcement learning inspired by advantage learning. Our dueling architecture represents two ...
Robust benchmarking in noisy environments
We propose a benchmarking strategy that is robust in the presence of timer error, OS jitter and other environmental fluctuations, and is insensitive to the highly nonideal statistics produced by timing measurements. We construct a model that explains how these strongly nonideal statistics can arise from environmental fluctuations, and also justifies our proposed strategy. We implement this stra...
Evolutionary Agent-based Policy Analysis in Dynamic Environments
Evolutionary algorithms (EAs) form a rich class of stochastic search methods that use the Darwinian principles of variation and selection to incrementally improve a set of candidate solutions (Eiben and Smith, 2003; Jong, 2006). Both principles can be implemented from a wide variety of components and operators, many with parameters that need to be tuned if the EA is to perform as intended. Tuni...
Journal
Journal title: Proceedings of the AAAI Conference on Artificial Intelligence
Year: 2020
ISSN: 2374-3468, 2159-5399
DOI: 10.1609/aaai.v34i05.6516